fix(chunking): preserve sentence order in NlpSentenceChunking #1913

Merged
ntohidi merged 1 commit into develop from fix/nlp-sentence-chunking-1909
Apr 16, 2026
ntohidi (Collaborator) commented Apr 11, 2026

Summary

Test plan

  • Verified sentence order is preserved with 10 ordered sentences
  • Verified duplicate sentences are no longer silently removed
  • Verified deterministic output across multiple runs
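
The three checks above could be sketched roughly as follows. This is only an illustration: `split_sentences` is a hypothetical regex-based stand-in, not the project's `NlpSentenceChunking` (which relies on NLTK), and the sample sentences are made up for the example.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Hypothetical stand-in chunker: split after '.', '!' or '?'.

    Not the real NlpSentenceChunking; it only mimics the post-fix contract
    of returning sentences in document order without deduplication.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def check_test_plan() -> bool:
    # 1. Order is preserved with 10 ordered sentences.
    ordered = [f"Sentence number {i}." for i in range(1, 11)]
    text = " ".join(ordered)
    assert split_sentences(text) == ordered

    # 2. Duplicate sentences are kept, not silently removed.
    dup_text = "Same line. Same line. Different line."
    assert split_sentences(dup_text).count("Same line.") == 2

    # 3. Output is identical across repeated calls.
    assert all(split_sentences(text) == ordered for _ in range(5))
    return True


check_test_plan()
```

Repeated calls inside one process only approximate the determinism check; ordering differences caused by `set()` typically show up between separate interpreter runs.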

Remove the broken re-import of load_nltk_punkt (already imported at module level).
Replace list(set(sens)) with a plain return: set() destroyed document order
and silently dropped duplicate sentences.
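
To illustrate why `list(set(sens))` was a problem, here is a minimal sketch of the two patterns side by side. The function names and sample sentences are hypothetical and are not taken from the project's code; only the `list(set(...))` expression comes from the commit message.

```python
def chunk_with_set(sentences: list[str]) -> list[str]:
    # Pre-fix pattern: a set has no defined order and collapses duplicates.
    return list(set(sentences))


def chunk_plain(sentences: list[str]) -> list[str]:
    # Post-fix pattern (assumed shape): return the sentences as they were found.
    return list(sentences)


sens = ["First point.", "Second point.", "First point.", "Third point."]

old = chunk_with_set(sens)
new = chunk_plain(sens)

# The set-based version loses one of the two "First point." entries,
# and the order of the remaining three is not guaranteed.
print(len(old), len(new))
print(new == sens)
```

Because Python randomizes string hashes per process by default, the order of `old` can also differ from one run to the next, which matches the non-deterministic output this PR addresses.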
ntohidi merged commit 4e86399 into develop on Apr 16, 2026
1 check passed